To create this blog I spent a few minutes looking up existing tools, but after getting lazy I decided to just write a quick script on my own. As always, it would definitely have been quicker to read up on some existing generator like Jekyll (ironically, GitHub Pages already uses it when hosting this website), and it would also have been less janky, but then I wouldn't learn as much!
If you're already familiar with static sites, HTML, markdown, etc., you can skip to the technical details of my generator.
Static sites are basically what a 'normal' website is: pre-written code that doesn't change once it's uploaded to a server, and which the browser reads in and renders as a page.
Nowadays, many websites are dynamically generated. Often, this means that the raw HTML source for the website is barebones and the rest of the site content is injected by lots of JavaScript. This is how frameworks like React go about things. The key benefit is that you no longer need, for example, a separate blog.html, index.html, about.html, etc.; instead, you can dynamically change the HTML of one page to pretend that there are many different views/pages. Similarly, you can change a lot based on user input, reuse code through components, and so on.
Although many of these features are nice, they can become bloat on a simple site that only has unchanging content. A blog, which is only occasionally updated with new content (a.k.a. new HTML to be generated), doesn't really need a bunch of dynamic pages and JavaScript. Basic HTML and CSS can do what they do best. In the end, the site loads faster, there's less JavaScript, and in theory less complexity overall.
To understand what a static site generator is, it is first important to note its use case, at least in the context of a blog.
In theory, since all of the code is pure HTML/CSS, it would be possible to just edit the source. That would mean that every time I wanted to write a blog post, I would have to sit down, open the website HTML in my code editor, and work inside a <p> tag with the occasional <i> or <h1>. This kind of sucks, since ideally, you want to separate the end user experience (well, the only end user is me... but still) from the coding part and you also want there to be a somewhat decent UI.
At the other end of the spectrum, there are websites like Medium and Substack that let you write in an appealing online GUI where you can see how your post will look while writing. Though nice, that goes beyond the territory of a static site generator and beyond my skillset/time, so I decided on a middle ground.
Those nice GUIs often employ text entry using Markdown, a simple markup language that is easy to read, write, and parse into viewable content. The static site generator I built takes in these markdown files and converts them into HTML pages (with CSS styling) that follow the structure of the original. The benefit of writing markdown over raw HTML is that the base file is already human readable and so it's much easier to write. For example, a sample heading and paragraph might look like
# Heading
To create this blog I spent a few minutes looking up existing tools, but after getting lazy I decided to just write a quick script on my ow- wait this is just this post!
Which is much nicer on the eye than a soup of HTML tags.
Now that we understand the starting point (a markdown document) and the endpoint (an HTML page the browser understands), it is easy to see what the static site generator should do: get us from point A to B, markdown to HTML.
Converting the markdown source file into HTML that the browser can understand isn't too hard since markdown is a simple specification. Doubly so when there are plenty of existing parser libraries that let you disregard the inner workings of conversion. The reason that writing this entire generator isn't so trivial is that, on top of having a page, I want to have a nice page that links other posts, is styled to be readable, and so on. To accomplish this, the static site generator has to find the markdown posts in a source directory (posts/), convert each one to HTML, add the shared page elements and styling, and write out the result so it can be deployed.
I used Python for the generator since anything else seems overkill. My script roughly follows the steps outlined above.
First, it finds all posts in a directory that contains the raw markdown. At first, I had it check if files were modified or new, but after this got annoying, I removed the code (I should probably fix it up and add it back later).
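The modified-or-new check is not in the script anymore, so the following is only a minimal sketch of one way it could work, assuming each markdown file maps to an HTML file with the same name in the output directory. The function name and both directory arguments are hypothetical. A post counts as stale when it has no output yet, or when the source was modified after the output was written.

```python
from pathlib import Path


def posts_to_rebuild(src_dir: Path, out_dir: Path) -> list[Path]:
    """Return markdown files that are new or newer than their HTML output."""
    stale = []
    for md_path in sorted(src_dir.glob("*.md")):
        # Assumed naming scheme: "Hello World.md" -> "Hello World.html"
        html_path = out_dir / (md_path.stem + ".html")
        if not html_path.exists():
            # New post: nothing has been generated for it yet
            stale.append(md_path)
        elif md_path.stat().st_mtime > html_path.stat().st_mtime:
            # Modified post: the source changed after the last build
            stale.append(md_path)
    return stale
```

Comparing modification times is cheap but can be fooled by a fresh checkout, where every file gets a new timestamp; comparing content hashes would be a sturdier alternative.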
Next, the markdown is fed to a Python markdown parsing library, marko, and then written to the respective .html file in a distribution directory. I am using post/ so that the URL looks nicer, for example blog.jacobryabinky.com/post/Hello World.
Note: the reason that this URL works (no .html) is the way that GitHub Pages deploys the files.
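The conversion step can be sketched roughly as follows. This is an illustration under assumptions and not the actual script: the function name and directory arguments are hypothetical, and the markdown converter is passed in as a plain function so the sketch does not depend on any one library.

```python
from pathlib import Path
from typing import Callable


def build_posts(src_dir: Path, out_dir: Path,
                convert: Callable[[str], str]) -> list[Path]:
    """Convert every markdown file in src_dir to an HTML file in out_dir."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for md_path in sorted(src_dir.glob("*.md")):
        markdown_text = md_path.read_text(encoding="utf-8")
        html_text = convert(markdown_text)
        # "Hello World.md" becomes "Hello World.html" in the output folder
        html_path = out_dir / (md_path.stem + ".html")
        html_path.write_text(html_text, encoding="utf-8")
        written.append(html_path)
    return written
```

With marko installed, passing marko.convert as the converter should work, since it takes markdown text and returns an HTML string, for example build_posts(Path("posts"), Path("dist/post"), marko.convert). The dist/ folder name is made up here.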
To add on other parts of the site, such as the navigation bar, footer, and head element, I decided to recreate components in a worse way! I think there is now some way of doing this better through something like web components, but I have never used or looked into them, so that could be part of an update to make my code less messy.
Right now, these elements are stored in a directory, elements/, which contains a subfolder for each element holding the necessary HTML and CSS files. In the script, I find these files, read in the text, and append or insert it somewhere in the generated page (for example, nav bar at the top, footer at the bottom, etc.). Similarly, for CSS, I just append all of the component elements' CSS files into the main index.css style sheet and rely on class names for scoping styles.
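A minimal sketch of the CSS bundling, assuming a layout like elements/nav/nav.css and a separate base stylesheet. The function name, the base stylesheet, and the output path are all hypothetical. Writing to a separate output file, instead of appending to the source stylesheet in place, keeps repeated runs from stacking duplicate rules.

```python
from pathlib import Path


def bundle_css(base_css: Path, elements_dir: Path, out_css: Path) -> str:
    """Write base_css plus every <element>/*.css file to out_css."""
    parts = [base_css.read_text(encoding="utf-8")]
    # Sorting keeps the bundle order stable between runs
    for css_path in sorted(elements_dir.glob("*/*.css")):
        label = css_path.parent.name  # subfolder name, e.g. "nav"
        styles = css_path.read_text(encoding="utf-8")
        # The comment header makes a rule easy to trace back to its component
        parts.append("/* " + label + " */\n" + styles)
    bundled = "\n".join(parts)
    out_css.parent.mkdir(parents=True, exist_ok=True)
    out_css.write_text(bundled, encoding="utf-8")
    return bundled
```

Nothing here scopes the styles, so two components that reuse a class name will still collide, which is why unique class names matter with this approach.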
The components are inserted into the newly generated blog post HTML files as well as the index.html file serving as the blog's homepage using the BeautifulSoup4 library, which lets me parse the HTML tags and insert or append directly into the body/a list/etc.
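As a rough illustration of that insertion step, here is a sketch that puts a nav bar at the top of the body and a footer at the bottom. The function name and arguments are hypothetical, and using the built-in html.parser backend is an assumption. It needs the beautifulsoup4 package installed.

```python
from bs4 import BeautifulSoup


def add_components(page_html: str, nav_html: str, footer_html: str) -> str:
    """Insert a nav bar at the top of <body> and a footer at the bottom."""
    soup = BeautifulSoup(page_html, "html.parser")
    body = soup.body
    if body is None:
        raise ValueError("page has no <body> element")

    # Parse each fragment, then move its nodes into the page one by one
    nav_nodes = list(BeautifulSoup(nav_html, "html.parser").contents)
    for position, node in enumerate(nav_nodes):
        body.insert(position, node)  # keeps the fragment's own node order

    footer_nodes = list(BeautifulSoup(footer_html, "html.parser").contents)
    for node in footer_nodes:
        body.append(node)

    return str(soup)
```

Copying the node list before moving anything matters, because inserting a node into the page removes it from the fragment it was parsed from.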
The last step is to actually deploy the pages and host them somewhere. Since I'm already uploading everything to GitHub, I decided to host off of GitHub Pages (also because it's free). I created a GitHub workflow that runs the static site generator script every time I push to the posts folder, in other words, every time there is a new post. Once the conversion is done, the workflow creates a new branch and pull request that I can then merge in to update the website.
The main reason my generator kind of sucks right now is that almost everything is hardcoded in, which works since I'm using it for just my blog site, but isn't great if I want to make changes or use it for other projects. The next steps will probably be to look into components and to abstract away some of these steps.